Sir:



Appellants are submitting the present appeal brief within two months of the Notice 
of Appeal, which was filed on September 15, 2003. 

Real Party in Interest 

Hewlett-Packard Company is the owner of the present invention under the 
assignment executed on September 18, 2000 and recorded on September 27, 2000. 

Related Appeals and Interferences 



None

Status of Claims

Claims 1-5 and 7-17 are the subject of this appeal. No other claims are pending. 
Claim 6 was previously cancelled. 

Status of Amendments 

No amendment was filed subsequent to the final rejection. 

Summary of Invention 

The first aspect of a method of searching a database to find documents similar to a
query document according to the present invention is recited in independent claim 1. The
second aspect of the present invention is recited in independent claim 14. These aspects
are best depicted in Figures 2 and 3 and summarized from page 2, line 16 to page 3, line
2 of Appellant's specification. These aspects include a layout data type. The layout data
type is best described from page 7, line 11 to page 8, line 14 with reference to Figure 3 of
Appellant's specification.

Generally, a method of searching a database, such as a database representative of
content on the World Wide Web, to find documents similar to a query document,
involves a step of decomposing the query document into elements of different data types.
After this, for one or more of the elements in a first data type, a first data type similarity
search is conducted to return match results from the database for the one or more
elements in the first data type. For one or more of the elements in a second data type, a
second data type similarity search is conducted to return match results from the database
for the one or more elements in the second data type. The match results from the different
data types are combined with an appropriate weighting to provide query document match
results. Data types can include text, pictures, and graphics, and also the layout of the
overall document.
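
By way of illustration only, the weighted combination of per-data-type match results described above might be sketched as follows. The function name, the score range of 0 to 1, the weight values, and the document identifiers are all assumptions made for this sketch; the specification does not prescribe a particular scoring formula.

```python
from collections import defaultdict


def combine_match_results(results_by_type, weights):
    """Merge per-data-type similarity scores into one ranked list.

    results_by_type: {data_type: {doc_id: score between 0 and 1}}  (assumed shape)
    weights:         {data_type: non-negative weight}              (assumed shape)
    """
    total_weight = sum(weights.get(t, 0.0) for t in results_by_type)
    if total_weight == 0:
        return []
    combined = defaultdict(float)
    for data_type, matches in results_by_type.items():
        # Normalise the weights so the combined score stays between 0 and 1.
        w = weights.get(data_type, 0.0) / total_weight
        for doc_id, score in matches.items():
            combined[doc_id] += w * score
    # Highest combined score first; ties broken by document id.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical match results from a text search and a picture search.
results = {
    "text": {"doc_a": 0.9, "doc_b": 0.4},
    "picture": {"doc_b": 0.8, "doc_c": 0.6},
}
weights = {"text": 2.0, "picture": 1.0}
ranked = combine_match_results(results, weights)
```

A document missing from one search simply contributes nothing for that data type, so a document matched by several data types tends to rank above one matched by a single data type with a similar score.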

A first aspect of the invention provides a method of searching a database to find 
documents similar to a query document. The query document is decomposed into 
elements of different data types, including a layout data type indicating the arrangement 
of the different data types within the query document. For one or more of the elements in 
a first data type, a first data type similarity search is conducted to return match results 
from the database for the one or more elements in the first data type. For one or more of 
the elements in a second data type, a second data type similarity search is conducted to 
return match results from the database for the one or more elements in the second data 
type. The match results from the first data type similarity search and the second data type 
similarity search are combined to provide query document match results. 
Advantageously, results from each query document match may be combined to allow 
progressive refinement of queries using any of the data types either singly or in further 
combination. 

In a second aspect, the invention provides a method of searching a database to find 
documents similar to a query document. The query document is decomposed into 
elements of different data types. A layout element in a layout datatype is determined 
from the spatial arrangement of the elements in the document. For the layout element, a 
layout similarity search is conducted to return match results from the database for the 
layout element. 

In addition to the separate elements provided by the page decomposition shown in
Figure 1 (graphic 11, text block 12, and picture 13), further information is provided in the
arrangement of the different elements within the document. As shown in Figure 3, a
further output available from page decomposition is a data type plan 31 representing the
document as a line art block, a text block, and an image block, arranged vertically in
sequence - decomposition into layouts is discussed in US Patent No. 6,002,798.
However, the present inventors have appreciated that this data type plan can itself be
used as a layout data type. This allows yet another element - the layout data type
element - to be used in search 32 of a database (provided that layout information is
available in or derivable from the database entries).

Layout similarity searching, whether used on its own or as one of the elements in a 
combined search as described in the first aspect of the invention, is more powerful if a 
number of different data types are used for text and for overall document type. Using a 
rule-based approach, different text blocks and whole documents, especially in the case of 
formal workflow documents, can be assigned particular functions with relatively high 
confidence. For example, it is well known that isolated text blocks at the top of a page 
and handwriting at the bottom are suggestive of a letter, and so different spatial regions of 
the document can be assigned to appropriate functional fields (address, letter text etc) - 
likewise, table and currency totals in a document can be identified as a discrete element, 
and their presence limits the document to another group (bill, quote or invoice). Layout 
searching can thus involve matching to templates representing different workflow 
document types (thus promoting matching of a document determined to be a letter against 
other letters). An appropriate mechanism is to normalise a layout for size, orientation and
skew, and then carry out an "exclusive or" operation on the query element and the
layout records in the database - this will be effective provided that all records involved
have a broadly common format.
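
The "exclusive or" comparison described above might be sketched as follows, purely as an illustration. The sketch assumes that each layout has already been normalised for size, orientation and skew, and that it is represented as a list of blocks with coordinates between 0 and 1. The grid size, the function names, and the sample layouts are hypothetical and are not taken from the specification.

```python
def layout_to_bits(blocks, grid=8):
    """Rasterise normalised blocks (x0, y0, x1, y1) onto a grid x grid bitmask.

    A cell is set when its centre falls inside any block.
    """
    bits = 0
    for x0, y0, x1, y1 in blocks:
        for row in range(grid):
            for col in range(grid):
                cx, cy = (col + 0.5) / grid, (row + 0.5) / grid
                if x0 <= cx < x1 and y0 <= cy < y1:
                    bits |= 1 << (row * grid + col)
    return bits


def xor_similarity(a, b, grid=8):
    """Return 1.0 for identical masks, falling as more cells differ."""
    differing = bin(a ^ b).count("1")  # cells occupied in exactly one layout
    return 1.0 - differing / (grid * grid)


# Hypothetical layouts: address block, body text, signature region.
letter_query = [(0.1, 0.05, 0.5, 0.2), (0.1, 0.3, 0.9, 0.75), (0.5, 0.8, 0.9, 0.95)]
letter_record = [(0.1, 0.05, 0.5, 0.2), (0.1, 0.3, 0.9, 0.65), (0.5, 0.7, 0.9, 0.95)]
# Hypothetical invoice: one full-width table occupying most of the page.
invoice_record = [(0.0, 0.25, 1.0, 0.9)]

query_bits = layout_to_bits(letter_query)
sim_letter = xor_similarity(query_bits, layout_to_bits(letter_record))
sim_invoice = xor_similarity(query_bits, layout_to_bits(invoice_record))
```

In this sketch the stored letter scores higher against the letter query than the stored invoice does, which is the behaviour the template matching relies on. A coarse grid tolerates small shifts in block position, at the cost of blurring fine distinctions between layouts.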

Issues 

Issue 1 — Whether claims 1-5 and 7-13 are patentable under 35 U.S.C. § 103(a) over 
U.S. Patent No. 6,243,713 B1 to Nelson ("Nelson") in view of U.S. Patent No. 6,460,036
B1 to Herz ("Herz").

Issue 2 — Whether claims 14-17 are patentable under §103(a) over the combination
of Nelson and Herz. 






Serial No. 09/647,266 



Art Unit: 2172 



Grouping of Claims 

Claims 1-5 and 7-13 form Group I and claims 14-17 form Group II. For each group 
of rejections that appellant contests herein that applies to more than one claim, such 
additional claims, to the extent separately identified and argued below, do not stand or 
fall together. 

Argument 

The Examiner's reading of Nelson in the Final Office Action betrays a 
misunderstanding of the nature of the claimed phrase "layout data type". 

Generally, Nelson describes a two phase process consisting of an indexing phase 
102 and a retrieval phase 104. The indexing phase 102 produces a multimedia index 140. 
The retrieval phase 104 selects and scores documents from the multimedia index 140 
using queries, such as a query to find the word "sunset" within 10 words of a picture of a 
sunset. (Nelson, abstract, col. 5 line 9 to col. 7 line 67). This is best shown in Figure 2. 

Nelson describes an indexing process in which "various multimedia components of 
a compound document are identified as to their types and their positions within the 
document...." and are "converted into one or more tokens, each with additional reference 
data. A token represents an abstraction of the multimedia component, and the reference 
data preferably describes the position of the multimedia component in the document (e.g. 
its character position or offset).... The various tokens . . . represent different types of 
multimedia data." (Nelson, col. 3, lines 21-24, lines 29-32, and lines 47-49). 

In Nelson, multimedia data types are the type of multimedia, e.g. text, image,
audio, and video. (Nelson, col. 5, lines 52-55). However, Nelson has no data type to
represent the layout of a document, such as a letter, a bill, a quote, or an invoice.
(Appellant's specification, page 8, lines 4-8).

The distinction between a multimedia data type and a "layout data type" appears to 
be one which the Examiner found rather subtle and missed. A "layout data type" is a 
higher level of abstraction than a multimedia data type. Nelson breaks a document into 
parts with each part having a multimedia data type, while the claimed invention in 
addition to breaking the document into parts, also considers the whole, i.e. the type of 
document by using a "layout data type". This provides many advantages over Nelson, 
including searches that properly represent the full document and more efficient searching 
by narrowing searches to documents having a particular type of layout. 

In addition, Appellants have carefully reviewed Herz and cannot find any 
disclosure of such a "layout data type". 

Issue 1 — Whether claims 1-5 and 7-13 are patentable under §103(a) over the 
combination of Nelson and Herz 

Claim 1 recites, inter alia, "decomposing the query document into different data 
types, including a layout data type indicating the arrangement of the different data types 
within the query document". The Office Action appears to wrongly equate these 
elements of claim 1 with Nelson's multimedia data type and position by citing several 
figures and sections in Nelson. (Office Action, page 3, para. 4). 

First, the Office Action cites Figure 3, which shows the multimedia query "(sailboat 
or ship or <picture> /shape) w/10 (sunset or sunrise or <picture> / color)". This query 
searches for the word "sailboat" or the word "ship" or a picture having a ship's shape 
within 10 words of the word "sunset" or the word "sunrise" or a picture having certain 
colors. In Figure 3, "the query 150 includes both text 151 and image 157 components,
and a number of query operators 152 defining both logical relationships 152 and
proximity relationships 156 between the multimedia components." (Nelson, col. 6, lines
44-48). In other words, the query searches for text or pictures within 10 words of each 
other that have certain values. This is not the same as a query including a layout data 
type. The claimed layout data type indicates the arrangement of the whole query 
document. For example, a layout data type of "letter" where the arrangement includes a 
date at the top, handwriting at the bottom, and other spatial regions assigned to functional 
fields (address block, body text, etc.) is distinctly different from a Nelson-type query that 
would describe each component of a letter in terms of its multimedia data type and 
position in the document. (Appellant's specification, page 8, lines 3-10). How would a 
Nelson-type query indicate the relative position of the signature to, say, the date? The 
number of words between them would depend on the length of the letter, so unless the 
letter were of determinate length it would not work. Also, a Nelson-type query is not 
guaranteed to find only letters. 

Just after the section cited by the Examiner, Nelson states "Once the compound 
query is input by the user, it is separated into its multimedia components, as during 
indexing." (Nelson, col. 6, line 66 to col. 7, line 1). Figure 3 shows a compound query. 
(Nelson, col. 6, lines 35-37, and lines 43-44). The fact that the compound query is 
separated into its multimedia components is clear evidence that the compound query in 
Figure 3 is a very different thing from the claimed "layout data type". 

Next, the Office Action cites Figure 4, which "is a flowgraph of the process of 
separating a compound document into multimedia components". (Nelson, col. 4, lines 
27-28). The output of Figure 4 is "the ordered list 440 of multimedia components", such 
as text, image, video, and audio. (Nelson, Figure 4, elements 100 and 440, col. 8, lines 
55-56). The claimed "layout data type indicating the arrangement of the different data 
types within the query document" is not the same as Nelson's ordered list of multimedia 
components. Nelson's list simply lists the parts of a document, while the claimed "layout
data type" considers the format or arrangement to determine the type of the whole
document.

The Office Action cites Figure 5, which shows "a further flow-graph of one simple
method of locating multimedia components within a document." (Nelson, col. 9, lines 9-11).
Figure 5 is described in Nelson at col. 9, which the Office Action partially cites.
(Nelson, col. 9, lines 18-53). Nelson discloses a method of using tags in documents, such
as a "{\pict" tag in an RTF document that indicates a picture. The method results in
identifying the type of multimedia component and its location in the document. The
claimed method step "decomposing the query document into different data types,
including a layout data type indicating the arrangement of the different data types within
the query document" is not the same as identifying the multimedia type and position of each
component, because the "layout data type" considers the whole arrangement and the type
of the document, not just the parts, and has many advantages over the Nelson method.

Finally, the Office Action cites "Compound documents are separated into 
constituent multimedia components of different data types, such as text, images, video, 
audio/voice, and other data types." (Nelson, col. 5, lines 52-55). The claimed "layout 
data type" is not the same as the Nelson multimedia data type, which indicates whether a 
component is text, image, video, etc. The "layout data type" indicates the arrangement of 
the different data types within the query document, considering the whole arrangement, 
not just the parts, which has many advantages over the Nelson data type. 

In addition, Appellants have carefully reviewed Herz and cannot find any 
disclosure of the claimed "layout data type". 

Thus, the combination of Nelson and Herz does not teach or suggest every element 
of claim 1. Therefore, Appellants respectfully request reversal of the rejection of claim 1. 

Claims 2-5 and 7-13, which depend directly or indirectly from claim 1, are
considered allowable by virtue of their dependencies, because they inherit the patentable
subject matter of claim 1. Therefore, Appellants respectfully request reversal of these
rejections as well.

Issue 2 — Whether claims 14-17 are patentable under §103(a) over the
combination of Nelson and Herz 

Claim 14 recites, inter alia, "determining a layout element in a layout datatype 
from the spatial arrangement of the elements in the document; and for the layout element, 
conducting a layout similarity search to return match results from the database for the 
layout element". The Office Action appears to wrongly equate these elements of claim 
14 with performing a search query as shown in Figure 3 and described in col. 6, lines 35- 
65. (Office Action, page 6). 

Figure 3 shows the multimedia query "(sailboat or ship or <picture> /shape) w/10 
(sunset or sunrise or <picture> / color)". This query searches for the word "sailboat" or 
the word "ship" or a picture having a ship's shape within 10 words of the word "sunset"
or the word "sunrise" or a picture having certain colors. In Figure 3, "the query 150 
includes both text 151 and image 157 components, and a number of query operators 152 
defining both logical relationships 152 and proximity relationships 156 between the 
multimedia components." (Nelson, col. 6, lines 44-48). In other words, the query 
searches for text or pictures within 10 words of each other that have certain values. This 
is not the same and not as useful as the claimed "conducting a layout similarity search to 
return match results from the database for the layout element" where the claimed "layout 
element" indicates "the spatial arrangement of elements in the document". For example, 
in a "letter" spatial arrangements include a date at the top, handwriting at the bottom, and 
other spatial regions assigned to functional fields (address block, body text, etc.). 
(Appellant's specification, page 8, lines 3-10). The spatial regions are groupings based 
on layout and logical relationships of parts to the whole, as opposed to Nelson's index 
based on whether something is text or a picture or simply its line number in the 
document. Layout searching thus involves matching to templates representing different
workflow document types (thus promoting matching of a document determined to be a
letter against other letters). (Appellant's specification, page 8, lines 8-10). The claimed
"layout similarity search" considers the whole layout of the document, while Nelson only 
considers the individual parts. A "layout similarity search" is more concise and efficient 
than Nelson's searching for each isolated part. A Nelson-type search more primitively 
searches for certain words and pictures at certain positions in a document, while the 
claimed invention searches documents that look like letters for those words and pictures. 

Neither Figure 3 nor the description of Figure 3 in Nelson describes the claimed
step of "determining a layout element in a layout datatype from the spatial arrangement
of the elements in the document." This is a separate step in claim 14 from "conducting a
layout similarity search to return match results from the database for the layout element",
even though the Examiner rejects them together. (Office Action, page 6). In Nelson,
there are indexing and retrieval steps. (Nelson, abstract). Figure 3 and its description in
col. 6 are part of retrieval, as indicated by "Multimedia retrieval operates as follows" at
the start of the paragraph describing Figure 3. (Nelson, col. 6, line 35). Appellants have also
carefully reviewed the indexing steps in Nelson and cannot find "determining a layout 
element in a layout datatype from the spatial arrangement of the elements in the 
document". 

In addition, Appellants have carefully reviewed Herz and cannot find any 
disclosure of the claimed "layout similarity search" and "layout datatype" in claim 14. 

Thus, the combination of Nelson and Herz does not teach or suggest every element 
of claim 14. Therefore, Appellants respectfully request reversal of the rejection of claim 
14. 

Claims 15-17, which depend directly or indirectly from claim 14, are considered
allowable by virtue of their dependencies, because they inherit the patentable subject
matter of claim 14. Therefore, Appellants respectfully request reversal of these rejections
as well.

Conclusion 

For the extensive reasons advanced above, Appellants respectfully but forcefully
contend that each claim is patentable. Therefore, reversal of all rejections is respectfully
requested. 



Respectfully submitted, 








Date Paul D. Greeley, 

Reg. No. 31019 
Attorney for the Applicant 
Ohlandt, Greeley, Ruggiero & Perle, L.L.P. 
One Landmark Square, 10th Floor
Stamford, CT 06901-2682 
Tel: 203-327-4500 
Fax: 203-327-6401

Appendix

1. (Previously amended) A method of searching a database to find documents similar
to a query document, comprising: 

decomposing the query document into different data types, including a layout data 
type indicating the arrangement of the different data types within the query document; 

for one or more of the elements in a first data type, conducting a first data type 
similarity search to return match results from the database for the one or more elements 
in the first data type; 

for one or more of the elements in a second data type, conducting a second data 
type similarity search to return match results from the database for the one or more 
elements in the first data type; and 

combining the match results from the first data type similarity search and the 
second data type similarity search with the layout data type to provide query document 
match results. 

2. (Original) A method as claimed in claim 1, wherein one of the data types is
representative of text. 

3. (Original) A method as claimed in claim 2, wherein a plurality of the data types are 
representative of text, separate data types of the plurality being representative of different 
functional blocks of text. 

4. (Previously amended) A method as claimed in claim 1, wherein one of the data 
types is representative of pictorial images. 

5. (Previously amended) A method as claimed in claim 1, wherein one of the data
types is representative of graphical images. 

6. (Cancelled)

7. (Previously amended) A method as claimed in claim 1, wherein the step of
similarity searching to return match results is carried out, separately, for a plurality of 
elements having between them more than two data types. 

8. (Previously amended) A method as claimed in claim 1, where all features of a 
common data type in the document are treated as one element. 

9. (Previously amended) A method as claimed in claim 1, wherein spatially distinct 
features of a common data type in the document are treated as separate elements. 

10. (Previously amended) A method as claimed in claim 1, wherein elements are user 
selectable or deselectable for the step of similarity searching. 

11. (Previously amended) A method as claimed in claim 1, wherein the similarity
searching results for separate elements are weighted before combination.

12. (Original) A method as claimed in claim 11, wherein said weighting is user
selected. 

13. (Original) A method as claimed in claim 11, wherein said weighting is attributed 
according to a determined significance of each relevant element in the document. 

14. (Previously amended) A method of searching a database to find documents similar 
to a query document, comprising: 

decomposing the query document into elements of different data types; 

determining a layout element in a layout datatype from the spatial arrangement of 
the elements in the document; and 

for the layout element, conducting a layout similarity search to return match results
from the database for the layout element.

15. (Original) A method as claimed in claim 14, wherein the layout similarity search
involves searching against templates representative of different document types. 

16. (Original) A method as claimed in claim 14, wherein the elements include 
elements of separate data types representative of different functional blocks of text. 

17. (Previously amended) A method as claimed in claim 14, wherein the elements 
include elements of data types representative of images. 